Clique-Based Lower Bounds for Parsing Tree-Adjoining Grammars
Authors
Abstract
Tree-adjoining grammars are a generalization of context-free grammars that are well suited to model human languages and are thus popular in computational linguistics. In the tree-adjoining grammar recognition problem, given a grammar Γ and a string s of length n, the task is to decide whether s can be obtained from Γ. Rajasekaran and Yooseph’s parser (JCSS’98) solves this problem in time O(n^{2ω}), where ω < 2.373 is the matrix multiplication exponent. The best algorithms avoiding fast matrix multiplication take time O(n^6). The first evidence for hardness was given by Satta (J. Comp. Linguist.’94): For a more general parsing problem, any algorithm that avoids fast matrix multiplication and is significantly faster than O(|Γ| n^6) in the case of |Γ| = Θ(n^12) would imply a breakthrough for Boolean matrix multiplication. Following an approach by Abboud et al. (FOCS’15) for context-free grammar recognition, in this paper we resolve many of the disadvantages of the previous lower bound. We show that, even on constant-size grammars, any improvement on Rajasekaran and Yooseph’s parser would imply a breakthrough for the k-Clique problem. This establishes tree-adjoining grammar parsing as a practically relevant problem with the unusual running time of n^{2ω}, up to lower order factors.
Related resources
Parsing Tree Adjoining Grammars With A Preprocessor
This paper presents a preprocessor based parsing system for Tree Adjoining Grammars. The preprocessor is used for two purposes: (1) to organize the data structures, (2) to reduce the runtime processing load so that the parser executes fast. A parallel parsing algorithm is presented that takes advantage of the preprocessor. The future goals of the proposed research are to achieve scalability and...
NCE Graph Grammars and Clique-Width
Graph grammars are widely used in order to define classes of graphs having some inductive and narrow structure. In many cases the narrowness can be measured in terms of the maximal tree-width and/or clique-width of the graphs in the class (see [RS86], [CO00] for definitions of these notions). It is known that using the corresponding tree-decomposition or clique-width parse term, any property of th...
TuLiPA: A syntax-semantics parsing environment for mildly context-sensitive formalisms
In this paper we present a parsing architecture that allows processing of different mildly context-sensitive formalisms, in particular Tree-Adjoining Grammar (TAG), Multi-Component Tree-Adjoining Grammar with Tree Tuples (TT-MCTAG) and simple Range Concatenation Grammar (RCG). Furthermore, for tree-based grammars, the parser computes not only syntactic analyses but also the corresponding semant...
Parsing Tree Adjoining Grammars and Tree Insertion Grammars with Simultaneous Adjunctions
A large part of wide coverage Tree Adjoining Grammars (TAG) is formed by trees that satisfy the restrictions imposed by Tree Insertion Grammars (TIG). This characteristic can be used to reduce the practical complexity of TAG parsing, applying the standard adjunction operation only in those cases in which the simpler cubic-time TIG adjunction cannot be applied. In this paper, we describe a parsi...
Lambek Grammars, Tree Adjoining Grammars and Hyperedge Replacement Grammars
Two recent extensions of the nonassociative Lambek calculus, the Lambek-Grishin calculus and the multimodal Lambek calculus, are shown to generate the same class of languages as tree adjoining grammars, using (tree generating) hyperedge replacement grammars as an intermediate step. As a consequence both extensions are mildly context-sensitive formalisms and benefit from polynomial parsing algorithms.